Practice Problems (Week 4)

These practice problems (unlike the exercises) are not assessed. Use them to test your understanding, to challenge yourself, or whatever floats your boat.

1 Tries and Tribulations

Here's the Trie data structure from Assignment 1.

data Trie = Trie Bool [(Char,Trie)]
  deriving (Eq,Show)

Write a function delete :: Trie -> String -> Trie which does exactly what its name and type suggests.
Write a function size :: Trie -> Int which counts the number of words in a Trie.
Think about what properties delete has. Is it an involution? Does it have a left and/or right inverse? Is it idempotent? How would you test these propositions?
Think about whether your delete implementation preserves well-formedness and minimality. If it doesn't, can you make it do?

module Trie where

import Test.QuickCheck
import Data.List(nub,sort)

data Trie = Trie Bool [(Char,Trie)]
  deriving (Eq,Show)

{- `empty` represents an empty dictionary. -}
empty :: Trie
empty = Trie False []

{- `single xs` represents a dictionary consisting of only `xs`. -}
single :: String -> Trie
single []     = Trie True []
single (x:xs) = Trie False [(x,single xs)]

{- `insert t xs` inserts the word xs into the dictionary t. -}
insert :: String -> Trie -> Trie
insert [] (Trie _ ts)     = Trie True ts
insert (x:xs) (Trie b ts) =
  case span ((<x) . fst) ts of
    (ts1,[]) -> Trie b $ ts1 ++ [(x,single xs)]
    (ts1,(y,t):ts2)
      | x == y    -> Trie b $ ts1 ++ (x,insert xs t):ts2
      | otherwise -> Trie b $ ts1 ++ (x,single xs):(y,t):ts2

delete :: Trie -> String -> Trie
delete (Trie b ts) [] = Trie False ts
delete (Trie b ts) (x:xs) =
  case span ((< x) . fst) ts of
    (ts,(y,t):ts')
      | x == y ->
        Trie b $ trim [(x,delete t xs)] ++ ts'
    _ -> Trie b ts
  where
  {- trim is for maintaining minimality -}
  trim :: [(Char,Trie)] -> [(Char,Trie)]
  trim [(_,Trie False [])] = []
  trim xs = xs

size :: Trie -> Int
size (Trie b ts) = fromEnum b + sum (map (size . snd) ts)

{- Delete should not be an involution: deleting an element twice does
     not put it back again.

   We *would* expect delete to be idempotent: deleting a word twice
     should be the same as deleting it once.

   We might expect insert to be a left/right inverse of delete,
     but this unfortunately will not work out.

   First, since we have established that idempotence should hold, we
     can rule out left inverse without knowing any details about the
     definition of insert: if f is an idempotent function that is
     *not* the identity function, f cannot have a left inverse.

   The proof goes as follows. Since f is not the identity, there
   exists x such that f(x) ≠ x. By idempotence, f(f(x)) = x.  Any left
   inverse would need to satisfy the following equations:

     g(f(x)) = x        (1)
     g(f(f(x))) = f(x)  (2)

   However, by equational reasoning it follows that:

     x =          (eqn 1)
     g(f(x)) =    (idempotence of f)
     g(f(f(x))) = (eqn 2)
     f(x)

   This contradicts the assumption that f(x) ≠ x; QED.

   We can rule out the existence of a right inverse by a similar argument.

   In terms of delete and insert, intuitively this is because when we
     insert a word that was already there, or delete a word that
     wasn't there, we cannot insert or delete the same word to get
     the original trie back.
 -}

genTrie :: Int -> Gen Trie
genTrie 0 = pure $ Trie True []
genTrie n =
  Trie <$> arbitrary <*> (genKeys >>= genSubtries) where
  genKeys :: Gen [Char]
  genKeys = sort . nub <$> (resize 5 . listOf $ elements ['a'..'d'])
  genSubtries :: [Char] -> Gen [(Char,Trie)]
  genSubtries cs =
      zip cs <$> vectorOf (length cs) (genTrie . max 0 $ n-1-length cs)

instance Arbitrary Trie where
  arbitrary = sized $ genTrie . min 15
  shrink (Trie b ts) =
    Trie b <$> shrinkList (\(x,t) -> curry id x <$> shrink t) ts' where
    ts' = zip ['a'..'d'] $ snd <$> ts

{- To test these algebraic properties, we can use quickCheck. -}

{- The default generator for strings is unlikely to generate
   useful test cases because it will generate strings comprised
   of all unicode characters.

   Therefore, let's make our own generator that will only
   produce strings of a-d characters of length 3 or shorter.
 -}
data AD = AD {fromAD :: String} deriving (Eq,Show)

instance Arbitrary AD where
  arbitrary = AD <$> (resize 3 . listOf . elements) ['a'..'d']

propIdem :: Trie -> AD -> Bool
propIdem t (AD xs) = delete (delete t xs) xs == delete t xs

propLeftInverse :: Trie -> AD -> Bool
propLeftInverse t (AD xs) =
  insert xs (delete t xs) == t

propRightInverse :: Trie -> AD -> Bool
propRightInverse t (AD xs) =
  delete (insert xs t) xs == t

propInvolution :: Trie -> AD -> Bool
propInvolution t (AD xs) = delete (delete t xs) xs == t

{- Well-formedness can only be violated by duplicating
   entries, or having unsorted entries.
   Since we neither add elements, nor change the order
   of existing elements, we will maintain well-formedness.

   Minimality is maintained by the use of the trim function.
 -}

2 Binary Search Trees

Consider the following data structure for representing binary search trees.

data SearchTree a = Leaf | Node a (SearchTree a) (SearchTree a)
  deriving (Eq,Show)

Define an ADT interface with all the usual tree operations: creating an empty tree, inserting, searching, deleting, toList, and anything else you feel like adding.

These are meant to be search trees, where smaller elements go to the left and larger elements go to the right. Define a well-formedness predicate to capture this requirement, and write tests to make sure that your functions maintain well-formedness.

Note that this only makes sense for Ord instances, so your type signatures should include that constraint, e.g.:

insert :: Ord a => SearchTree a -> a -> SearchTree a

data SearchTree a = Leaf | Node a (SearchTree a) (SearchTree a)
  deriving (Eq,Show)

emptyTree :: SearchTree a
emptyTree = Leaf

{- The spec isn't entirely clear on what to do with duplicate
   elements; we decide to drop them.
 -}
treeInsert :: Ord a => SearchTree a -> a -> SearchTree a
treeInsert Leaf a = Node a Leaf Leaf
treeInsert (Node a l r) b
  | a < b     = Node a (treeInsert l b) r
  | a == b    = Node a l (treeInsert r b)
  | otherwise = Node a l r

treeLookup :: Ord a => SearchTree a -> a -> Bool
treeLookup Leaf _ = False
treeLookup (Node a l r) b
  | a < b     = treeLookup l b
  | a == b    = True
  | otherwise = treeLookup r b

{- This is the in-order walk.
   We could also have done pre- and post-order
   walks.
 -}
treeToList :: Ord a => SearchTree a -> [a]
treeToList Leaf = []
treeToList (Node a l r) = treeToList l ++ a:treeToList r

treeWellFormed :: Ord a => SearchTree a -> Bool
treeWellFormed Leaf = True
treeWellFormed (Node _ Leaf Leaf) = True
treeWellFormed (Node a (Node b l r) Leaf) =
  b < a && treeWellFormed (Node b l r)
treeWellFormed (Node a (Node b l r) (Node c l' r')) =
  b < a &&
  a < c &&
  treeWellFormed (Node b l r) &&
  treeWellFormed (Node c l' r')

{- Actually using these tests requires a generator
   for trees.

   It's probably best to write one that guarantees
   well-formedness, to minimise discarded tests.
 -}
prop_wf1 :: SearchTree Int -> Int -> Property
prop_wf1 t n =
  treeWellFormed t
  ==>
  treeWellFormed(treeInsert t n)

prop_wf2 :: SearchTree Int -> Int -> Property
prop_wf2 t n =
  treeWellFormed emptyTree

3 Faustian Programming

In a moment of personal weakness that you will later come to bitterly regret, you sign a contract with a notorious swindler to outsource some routine programming tasks. To uphold their part of the bargain, the swindler must implement a map function, which for disambiguation we'll call faustianMap :: (a -> b) -> [a] -> [b]. The contract stipulates that the faustianMap must satisfy the following algebraic properties.

length (faustianMap f xs) == length xs
faustianMap (f . g) xs == faustianMap f (faustianMap g xs)

Can you think of an implementation of faustianMap that the swindler could write, that satifies the letter of the contract, but which is not map?
Can you think of an implementation of faustianMap that does not satisfy the letter of the contract, but does so in a way that is extremely unlikely to be found by quickCheck?
Suppose we add one more property to our contract:
```
faustianMap id xs == xs
```
…but strengthen the type signature as follows:
```
faustianMap :: (a -> a) -> [a] -> [a]
```
This opens up a different way to swindle. Can you find it?
Extending the setting of Question 3, let's strengthen the type signature in a different way.
```
faustianMap :: (Int -> Int) -> [Int] -> [Int]
```
Now, there's a lot more ways we can swindle! Try to find some.
For an extra challenge: why was it necessary to to change the type signature of faustianMap to make the swindle still possible in Question 3?

There's a moral to this story, and not just "don't make deals with the devil". It seems that the more specific we make the type signature, the easier the swindler's life is going to be. So abstracting away from concrete types doesn't just make your code more generally applicable; it also makes it harder to get it wrong. Everybody wins!

-- Question 1
faustianMap :: (a -> b) -> [a] -> [b]
faustianMap f [] = []
faustianMap f (x:xs) = replicate (length xs + 1) (f x)

-- Question 2
faustianMap2 :: (a -> b) -> [a] -> [b]
faustianMap2 f l
  | length l == 2786781 -> []
  | otherwise = map f l

-- Question 3
faustianMap3 :: (a -> a) -> [a] -> [a]
faustianMap3 f xs = xs

-- Question 4
faustianMap3 :: (Int -> Int) -> [Int] -> [Int]
faustianMap3 f xs = replicate (length xs) 1

-- Question 5

{- Suppose we have a total function

     faust :: (a -> b) -> [a] -> [b]

   Such that for all xs,

     faust id xs = xs

   It follows that for all xs,

     map f xs = faust f xs

   The reason for this is related to properties of Haskell's type
   system. In order to produce a list of values of type `b`.  We
   cannot simply conjure our `b`s out of thin air: our only source of
   `b`s is the function `f`. But in order to apply `f`to produce
   values of type `b`, we need values of type `a`. And our only source
   of *those* is the elements of the list `a`.

   Therefore, we know that every element of `faust f xs` must be
   the result of applying `f` to some element in `xs`.

   To obtain that the n:th element of `faust x xs` must be
   the `f(xs!!n)`, observe that if this was not the case,
   we would have

     faust id xs ≠ xs

   for lists xs where all elements are distinct.
 -}